Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for compression of the oplog in case of mergeable operations. #100

Open
wants to merge 36 commits into
base: main
Choose a base branch
from

Conversation

Wulf0x67E7
Copy link

@Wulf0x67E7 Wulf0x67E7 commented Sep 14, 2021

What

Adds one associated const and fn each to Absorb, as well the new Type TryCompressResult returned by that fn. Together they allow WriteHandle::extend (and by extension WriteHandle::append) to optimize the op-log by combining certain operations into one.

Why

Reading/Watching about left-right/evmap it was mentioned that on top of the base doubling (unless carefully deduplicated) of memory the op-log can get quite out of hand if only infrequently published. That got me thinking about how, in a toy example of a simple map, you could combine set-ops with the same key to be the last one, merge certain modify-ops (like add), and completely consume the op-log when emitting a clear-op. Then I looked at the source and found it quite straightforward to implement. So I did (in about six hours).

How

When compression is enabled WriteHandle::extend reverse-walks the fully unpublished portion of the op-log while attempting to combine ops. It does this by calling Absorb::try_compress and either:

  • skipping an empty entry
  • leaving behind an empty entry and continuing on when it returned TryCompressResult::Compressed
  • skipping an entry and continuing on when it returned TryCompressResult::Independent
  • stopping and inserting the op when it returned TryCompressResult::Dependent

After finishing for all ops it then stably moves all empty entries to the end of the op-log before truncating it.

I have also added three (edit: four) complementary optimizations to avoid some pathological worst cases:

  • It keeps track of the first non-empty entry from the back to avoid fully iterating a long but empty op-log in the case of a 'clear' (which consumes everything) being followed by a 'set'
  • edit: It keeps track of the last non-empty entry from the front to avoid fully iterating a full op-log in the case of all the empty entries being only found towards the back.
  • It keeps track of its most recently encountered empty entry to avoid unnecessarily growing the op-log when inserting.
  • It stops after getting Absorb::MAX_COMPRESS_RANGE TryCompressResult::Independents in a row to avoid iterating the full op-log in case it is filled with predominantly independent ops.

Caveats

  • Even though they are only temporarily empty during WriteHandle::extend, the op-log now contains options instead of values. As ops are basically always enums, enum folding and opt.unwrap_or_else(|| unreachable!()) should(?) remove any overhead.
  • Probably needs more tests than the ones I have already added. (edit: everything should now be covered)
  • Lots of slow but thorough debug_assert!(...)s, which should leave no traces in a release build.

Is it useful?

Frankly, I don't know. I haven't actually used left-right in any serious projects and so can't really judge if this optimization is worthwhile. However, because compression is controlled via an associated const, it should monomorphize away and so at least not degrade performance when disabled, otherwise, it basically shifts writer-side computation from publishing to extending while potentially saving a fair chunk of memory. (Though come to think of it, it should also be quite straightforward to move the compression into WriteHandle::publish, though it then wouldn't be saving any memory, which likely is the main benefit.)


This change is Reviewable

@codecov
Copy link

codecov bot commented Sep 14, 2021

Codecov Report

Merging #100 (f512288) into master (d6d1312) will increase coverage by 7.23%.
The diff coverage is 81.73%.

Impacted Files Coverage Δ
benches/benchmark.rs 0.00% <0.00%> (ø)
src/lib.rs 94.11% <66.66%> (+9.50%) ⬆️
src/write.rs 87.42% <90.17%> (+2.44%) ⬆️
src/utilities.rs 100.00% <100.00%> (ø)

@jonhoo
Copy link
Owner

jonhoo commented Sep 19, 2021

This is super interesting, and I'm not ignoring it, just need to find time to do a proper review!

@Wulf0x67E7
Copy link
Author

No problem. Gives me more time to iron out any kinks I find.

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really exciting change, and I love all the thought you've put into it. I think what I'd like to see above all is a benchmark to see that this compression actually manifests as a meaningful speedup to writes. I think there is an argument to be made that the compression logic may end up being slower than just doing the extra operations, and would like to see some benchmarks refuting that. You'll probably want knobs to dictate the distance between publishes, the cost of execution an operation, and the cost of attempt to compress two operations.

Separately, this logic is fairly involved, and even though you've done a good job of writing test cases, I'd like to see a quickcheck test or two added as well, just because there are so many corner-cases to consider.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/utilities.rs Show resolved Hide resolved
src/utilities.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Show resolved Hide resolved
Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another look through — I'm really liking the changes so far!

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/utilities.rs Show resolved Hide resolved
/// and [`TryCompressResult::Dependent`] that they can not be compressed, and `prev` must precede `next`.
///
/// Defaults to [`TryCompressResult::Dependent`], which sub-optimally disables compression.
/// Setting [`Self::MAX_COMPRESS_RANGE`](Absorb::MAX_COMPRESS_RANGE) to or leaving it at it's default of `0` is vastly more efficient for that.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rustfmt may complain about the line wrapping of this comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have vscode configured to automatically run cargo fmt on save and cargo doc doesn't complain about it, so I think it's fine? The single break between 'Defaults...' and 'Setting...' is purely meant to make reading it in an editor easier and should be turned into a simple space on docs.rs.

src/write.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Outdated Show resolved Hide resolved
src/write.rs Show resolved Hide resolved
@@ -0,0 +1,207 @@
use std::collections::{BTreeMap, HashMap};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you post the results from these once you feel like they're in a decent place?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Done for today though. In their current state compression with unlimited range against no compression is about 20% faster for btreemap and about 35% faster for hashmap, with limited compression falling somewhere in between. Considering that due to the higher lookup-cost btreemap should be the one benefitting more there definitely is something fishy going on.

@Wulf0x67E7
Copy link
Author

Wulf0x67E7 commented Oct 12, 2021

While working on the benchmarks I just had an epiphany about an even more powerful and general improvement: Custom oplog data structures.

The original idea behind compression was to combine related ops, but the purely linear nature of the oplog meant I was essentially creating an unordered vec-map with linear lookup, which was fine and worthwhile for small maps but comes to a screeching halt for larger ones.

Then why not make the oplog itself a map and allow for custom append logic?

While this idea can be user-implemented by using compression to have a singleton op contain it (which is an excellent use case!), it would be less convoluted to have Absorb have an associated OpLog type and a log_op(OpLog, Op) method and either have absorb_(first/second) take the OpLog instead of single ops or have oplog have (iter/drain)_ops -> Iterator, the latter of which could be made to be backward-compatible but may require an additional OpLog trait and run into issues due to GATs.

I'm going to start another branch, tell me if you want me to open a new PR or merge it into this one. Edit: Wulf0x67E7:custom-log.

Edit:

While implementing it I realized that by replacing WriteHandle::append/extend with WriteHandle::pending -> &mut O you don't even need an associated type if you let O itself act as the OpLog, you just need T::is_empty(&O).

WriteHandle<T, O>::append(O) becomes WriteHandle<T, Vec<O>>::pending().push(O)

@jonhoo
Copy link
Owner

jonhoo commented Oct 24, 2021

I like it — give it a shot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants